Lumping all small groups into a single group when summarising a table

news
Author

Tony Duan

Published

October 10, 2022

Code
library(tidyverse)

Summarising the table

Code
head(mtcars)
                   mpg cyl disp  hp drat    wt  qsec vs am gear carb
Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1

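Summarise by a single variable, hp. For each group, compute the number of observations (row_record), the sum of drat (total_drat), and the group's share of the overall drat total (total_drat_share).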

Code
mtcars %>% group_by(hp) %>% summarise(row_record=n(),total_drat=sum(drat)) %>% 
  mutate(total_drat_share = round(total_drat / sum(total_drat),2))
# A tibble: 22 × 4
      hp row_record total_drat total_drat_share
   <dbl>      <int>      <dbl>            <dbl>
 1    52          1       4.93             0.04
 2    62          1       3.69             0.03
 3    65          1       4.22             0.04
 4    66          2       8.16             0.07
 5    91          1       4.43             0.04
 6    93          1       3.85             0.03
 7    95          1       3.92             0.03
 8    97          1       3.7              0.03
 9   105          1       2.76             0.02
10   109          1       4.11             0.04
# … with 12 more rows
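
A quick sanity check on the summary above: the group counts should add up to the number of rows in mtcars, and the shares should add up to 1 before rounding. This is a minimal sketch; `by_hp` is a name introduced here for illustration only.

```r
library(dplyr)

# Same summary as above, but keep the unrounded share so it can be checked.
by_hp <- mtcars %>%
  group_by(hp) %>%
  summarise(row_record = n(), total_drat = sum(drat)) %>%
  mutate(total_drat_share = total_drat / sum(total_drat))

# Group counts should cover every row of mtcars.
sum(by_hp$row_record) == nrow(mtcars)

# Unrounded shares should sum to 1 (up to floating-point tolerance).
isTRUE(all.equal(sum(by_hp$total_drat_share), 1))
```

Rounding each share to 2 decimals, as in the table, means the displayed shares may not sum to exactly 1.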

Keep only the 3 most frequent hp groups; every other value is lumped into "Other".

Code
mtcars %>% mutate(new_hp = as.factor(hp) %>% fct_lump_n(3, other_level = "Other")) %>%
  group_by(new_hp) %>% summarise(row_record = n()) %>% arrange(desc(row_record))
# A tibble: 4 × 2
  new_hp row_record
  <fct>       <int>
1 Other          23
2 110             3
3 175             3
4 180             3
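
To see what fct_lump_n() does in isolation, here is a small sketch on a made-up factor (the vector `x` is hypothetical, not from mtcars). It keeps the n most frequent levels and recodes everything else to the other_level label, which is moved to the end of the level order.

```r
library(forcats)

# Hypothetical toy data: "a" appears 4 times, "b" 3 times, "c" and "d" once each.
x <- factor(c("a", "a", "a", "a", "b", "b", "b", "c", "d"))

# Keep the 2 most frequent levels; all remaining levels become "Other".
lumped <- fct_lump_n(x, n = 2, other_level = "Other")

levels(lumped)
table(lumped)
```

As far as I understand the forcats documentation, ties at the cutoff are controlled by the `ties.method` argument, so more than n levels can survive when several levels share the same count. In the mtcars example this does not matter, because exactly three hp values occur 3 times.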

Keep only the hp groups that occur at least 2 times; the rest are lumped into "Other".

Code
mtcars %>% mutate(new_hp = as.factor(hp) %>% fct_lump_min(2, other_level = "Other")) %>%
  group_by(new_hp) %>% summarise(row_record = n(), total_wt = sum(wt)) %>% arrange(desc(row_record))
# A tibble: 8 × 3
  new_hp row_record total_wt
  <fct>       <int>    <dbl>
1 Other          15    47.2 
2 110             3     8.71
3 175             3    10.1 
4 180             3    11.6 
5 66              2     4.14
6 123             2     6.88
7 150             2     6.96
8 245             2     7.41
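
fct_lump_min() also accepts a weight vector through its `w` argument, in which case the threshold applies to the summed weights rather than the row counts. The sketch below first reproduces the count-based lumping, then lumps by total drat. The threshold of 8 and the names `by_count` and `by_drat` are arbitrary choices made for illustration.

```r
library(forcats)

hp_f <- as.factor(mtcars$hp)

# Count-based: keep hp values that occur at least 2 times.
by_count <- fct_lump_min(hp_f, min = 2, other_level = "Other")
nlevels(by_count)

# Weight-based: keep hp values whose summed drat is at least 8
# (8 is an arbitrary threshold chosen for this example).
by_drat <- fct_lump_min(hp_f, min = 8, w = mtcars$drat, other_level = "Other")
table(by_drat)
```

The weighted form is useful when "small" should mean a small contribution to a total rather than a small number of rows.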
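
Attach the group totals to every row instead of collapsing the table: the first add_count() stores the per-hp sum of drat in n, and the second stores the per-hp row count in nn.

Code
mtcars %>% add_count(hp, wt = drat) %>% add_count(hp)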
    mpg cyl  disp  hp drat    wt  qsec vs am gear carb     n nn
1  21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4 10.88  3
2  21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4 10.88  3
3  22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1  3.85  1
4  21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1 10.88  3
5  18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2  9.85  3
6  18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1  2.76  1
7  14.3   8 360.0 245 3.21 3.570 15.84  0  0    3    4  6.94  2
8  24.4   4 146.7  62 3.69 3.190 20.00  1  0    4    2  3.69  1
9  22.8   4 140.8  95 3.92 3.150 22.90  1  0    4    2  3.92  1
10 19.2   6 167.6 123 3.92 3.440 18.30  1  0    4    4  7.84  2
11 17.8   6 167.6 123 3.92 3.440 18.90  1  0    4    4  7.84  2
12 16.4   8 275.8 180 3.07 4.070 17.40  0  0    3    3  9.21  3
13 17.3   8 275.8 180 3.07 3.730 17.60  0  0    3    3  9.21  3
14 15.2   8 275.8 180 3.07 3.780 18.00  0  0    3    3  9.21  3
15 10.4   8 472.0 205 2.93 5.250 17.98  0  0    3    4  2.93  1
16 10.4   8 460.0 215 3.00 5.424 17.82  0  0    3    4  3.00  1
17 14.7   8 440.0 230 3.23 5.345 17.42  0  0    3    4  3.23  1
18 32.4   4  78.7  66 4.08 2.200 19.47  1  1    4    1  8.16  2
19 30.4   4  75.7  52 4.93 1.615 18.52  1  1    4    2  4.93  1
20 33.9   4  71.1  65 4.22 1.835 19.90  1  1    4    1  4.22  1
21 21.5   4 120.1  97 3.70 2.465 20.01  1  0    3    1  3.70  1
22 15.5   8 318.0 150 2.76 3.520 16.87  0  0    3    2  5.91  2
23 15.2   8 304.0 150 3.15 3.435 17.30  0  0    3    2  5.91  2
24 13.3   8 350.0 245 3.73 3.840 15.41  0  0    3    4  6.94  2
25 19.2   8 400.0 175 3.08 3.845 17.05  0  0    3    2  9.85  3
26 27.3   4  79.0  66 4.08 1.935 18.90  1  1    4    1  8.16  2
27 26.0   4 120.3  91 4.43 2.140 16.70  0  1    5    2  4.43  1
28 30.4   4  95.1 113 3.77 1.513 16.90  1  1    5    2  3.77  1
29 15.8   8 351.0 264 4.22 3.170 14.50  0  1    5    4  4.22  1
30 19.7   6 145.0 175 3.62 2.770 15.50  0  1    5    6  9.85  3
31 15.0   8 301.0 335 3.54 3.570 14.60  0  1    5    8  3.54  1
32 21.4   4 121.0 109 4.11 2.780 18.60  1  1    4    2  4.11  1
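
The automatic column names n and nn in the output above are easy to mix up. A possible variant, sketched here, uses the `name` argument of add_count() to label each column explicitly; `with_totals`, `total_drat`, and `row_record` are names chosen for this example.

```r
library(dplyr)

# name = sets the output column explicitly, so the second call does not
# fall back to the automatic "nn" name.
with_totals <- mtcars %>%
  add_count(hp, wt = drat, name = "total_drat") %>%
  add_count(hp, name = "row_record")

# Show a few rows with only the relevant columns.
head(with_totals[, c("hp", "drat", "total_drat", "row_record")], 3)
```

The values are the same as in n and nn; only the column names change.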